Crate proptest

Proptest is a property testing framework (i.e., the QuickCheck family) inspired by the Hypothesis framework for Python. It lets you test that certain properties of your code hold for arbitrary inputs, and if a failure is found, automatically finds the minimal test case to reproduce the problem. Unlike QuickCheck, generation and shrinking are defined on a per-value basis instead of per-type, which makes it more flexible and simplifies composition.

If you have dependencies which provide QuickCheck Arbitrary implementations, see also the related proptest-quickcheck-interop crate, which enables reusing those implementations with proptest.

Introduction

Property testing is a system of testing code by checking that certain properties of its output or behaviour are fulfilled for all inputs. These inputs are generated automatically, and, critically, when a failing input is found, the input is automatically reduced to a minimal test case.

Property testing is best used to complement traditional unit testing (i.e., using specific inputs chosen by hand). Traditional tests can test specific known edge cases, simple inputs, and inputs that were known in the past to reveal bugs, whereas property tests will search for more complicated inputs that cause problems.

Getting Started

Let’s say we want to make a function that parses dates of the form YYYY-MM-DD. We’re not going to worry about validating the date; any triple of integers is fine. So let’s bang something out real quick.

fn parse_date(s: &str) -> Option<(u32, u32, u32)> {
    if 10 != s.len() { return None; }
    if "-" != &s[4..5] || "-" != &s[7..8] { return None; }

    let year = &s[0..4];
    let month = &s[6..7];
    let day = &s[8..10];

    year.parse::<u32>().ok().and_then(
        |y| month.parse::<u32>().ok().and_then(
            |m| day.parse::<u32>().ok().map(
                |d| (y, m, d))))
}

It compiles, that means it works, right? Maybe not, let’s add some tests.

#[test]
fn test_parse_date() {
    assert_eq!(None, parse_date("2017-06-1"));
    assert_eq!(None, parse_date("2017-06-170"));
    assert_eq!(None, parse_date("2017006-17"));
    assert_eq!(None, parse_date("2017-06017"));
    assert_eq!(Some((2017, 06, 17)), parse_date("2017-06-17"));
}

Tests pass, deploy to production! But now your application starts crashing, and people are upset that you moved Christmas to February. Maybe we need to be a bit more thorough.

In Cargo.toml, add

[dev-dependencies]
proptest = "0.8.7"

and at the top of main.rs or lib.rs:

#[macro_use] extern crate proptest;

Now we can add some property tests to our date parser. But how do we test the date parser for arbitrary inputs, without making another date parser in the test to validate it? We won’t need to as long as we choose our inputs and properties correctly. But before correctness, there’s actually an even simpler property to test: The function should not crash. Let’s start there.

proptest! {
    #[test]
    fn doesnt_crash(s in "\\PC*") {
        parse_date(&s);
    }
}

What this does is take a literally random &String (ignore \\PC* for the moment, we’ll get back to that — if you’ve already figured it out, contain your excitement for a bit) and give it to parse_date() and then throw the output away.

When we run this, we get a bunch of scary-looking output, eventually ending with

thread 'main' panicked at 'Test failed: byte index 4 is not a char boundary; it is inside 'ௗ' (bytes 2..5) of `aAௗ0㌀0`; minimal failing input: s = "aAௗ0㌀0"
	successes: 102
	local rejects: 0
	global rejects: 0
'

If we look at the top directory after the test fails, we’ll see a new proptest-regressions directory, which contains some files corresponding to source files containing failing test cases. These are failure persistence files. The first thing we should do is add these to source control.

$ git add proptest-regressions

The next thing we should do is copy the failing case to a traditional unit test, since it has exposed a bug unlike anything we’ve tested in the past.

#[test]
fn test_unicode_gibberish() {
    assert_eq!(None, parse_date("aAௗ0㌀0"));
}

Now, let’s see what happened… we forgot about UTF-8! You can’t just blindly slice strings since you could split a character, in this case that Tamil diacritic placed atop other characters in the string.
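To see the problem concretely, here is a minimal sketch using only the standard library. The helper name slice_checked is hypothetical (it is not part of the tutorial's code); it relies on str::get, which returns None rather than panicking when a byte range is out of bounds or does not fall on character boundaries.

```rust
/// Hypothetical helper, not part of the tutorial's code: slices `s` by byte
/// range, returning `None` instead of panicking when the range is out of
/// bounds or would split a multi-byte character.
fn slice_checked(s: &str, start: usize, end: usize) -> Option<&str> {
    s.get(start..end)
}
```

For the failing input above, slice_checked("aAௗ0㌀0", 4, 5) is None, because byte 4 lies inside the three-byte character occupying bytes 2..5, even though the string is exactly 10 bytes long and so passes the length check.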

In the interest of making the code changes as small as possible, we’ll just check that the string is ASCII and reject anything that isn’t.

fn parse_date(s: &str) -> Option<(u32, u32, u32)> {
    if 10 != s.len() { return None; }

    // NEW: Ignore non-ASCII strings so we don't need to deal with Unicode.
    if !s.is_ascii() { return None; }

    if "-" != &s[4..5] || "-" != &s[7..8] { return None; }

    let year = &s[0..4];
    let month = &s[6..7];
    let day = &s[8..10];

    year.parse::<u32>().ok().and_then(
        |y| month.parse::<u32>().ok().and_then(
            |m| day.parse::<u32>().ok().map(
                |d| (y, m, d))))
}

The tests pass now! But we know there are still more problems, so let’s test more properties.

Another property we want from our code is that it parses every valid date. We can add another test to the proptest! section:

proptest! {
    // snip...

    #[test]
    fn parses_all_valid_dates(s in "[0-9]{4}-[0-9]{2}-[0-9]{2}") {
        parse_date(&s).unwrap();
    }
}

The thing to the right-hand side of in is actually a regular expression, and s is chosen from strings which match it. So in our previous test, "\\PC*" was generating arbitrary strings composed of arbitrary non-control characters. Now, we generate things in the YYYY-MM-DD format.

The new test passes, so let’s move on to something else.

The final property we want to check is that the dates are actually parsed correctly. Now, we can’t do this by generating strings — we’d end up just reimplementing the date parser in the test! Instead, we start from the expected output, generate the string, and check that it gets parsed back.

proptest! {
    // snip...

    #[test]
    fn parses_date_back_to_original(y in 0u32..10000,
                                    m in 1u32..13, d in 1u32..32) {
        let (y2, m2, d2) = parse_date(
            &format!("{:04}-{:02}-{:02}", y, m, d)).unwrap();
        // prop_assert_eq! is basically the same as assert_eq!, but doesn't
        // cause a bunch of panic messages to be printed on intermediate
        // test failures. Which one to use is largely a matter of taste.
        prop_assert_eq!((y, m, d), (y2, m2, d2));
    }
}

Here, we see that besides regexes, we can use any expression which is a proptest::strategy::Strategy, in this case, integer ranges.

The test fails when we run it. Though there’s not much output this time.

thread 'main' panicked at 'Test failed: assertion failed: `(left == right)` (left: `(0, 10, 1)`, right: `(0, 0, 1)`) at examples/dateparser_v2.rs:46; minimal failing input: y = 0, m = 10, d = 1
	successes: 2
	local rejects: 0
	global rejects: 0
', examples/dateparser_v2.rs:33
note: Run with `RUST_BACKTRACE=1` for a backtrace.

The failing input is (y, m, d) = (0, 10, 1), which is a rather specific output. Before thinking about why this breaks the code, let’s look at what proptest did to arrive at this value. At the start of our test function, insert

    println!("y = {}, m = {}, d = {}", y, m, d);

Running the test again, we get something like this:

y = 2497, m = 8, d = 27
y = 9641, m = 8, d = 18
y = 7360, m = 12, d = 20
y = 3680, m = 12, d = 20
y = 1840, m = 12, d = 20
y = 920, m = 12, d = 20
y = 460, m = 12, d = 20
y = 230, m = 12, d = 20
y = 115, m = 12, d = 20
y = 57, m = 12, d = 20
y = 28, m = 12, d = 20
y = 14, m = 12, d = 20
y = 7, m = 12, d = 20
y = 3, m = 12, d = 20
y = 1, m = 12, d = 20
y = 0, m = 12, d = 20
y = 0, m = 6, d = 20
y = 0, m = 9, d = 20
y = 0, m = 11, d = 20
y = 0, m = 10, d = 20
y = 0, m = 10, d = 10
y = 0, m = 10, d = 5
y = 0, m = 10, d = 3
y = 0, m = 10, d = 2
y = 0, m = 10, d = 1

The test failure message said there were two successful cases; we see these at the very top, 2497-08-27 and 9641-08-18. The next case, 7360-12-20, failed. There’s nothing immediately obviously special about this date. Fortunately, proptest reduced it to a much simpler case. First, it rapidly reduced the y input to 0 at the beginning, and similarly reduced the d input to the minimum allowable value of 1 at the end. Between those two, though, we see something different: it tried to shrink 12 to 6, but then ended up raising it back up to 10. This is because the 0000-06-20 and 0000-09-20 test cases passed.

In the end, we get the date 0000-10-01, which apparently gets parsed as 0000-00-01. Again, this failing case was added to the failure persistence file, and we should add this as its own unit test:

$ git add proptest-regressions

#[test]
fn test_october_first() {
    assert_eq!(Some((0, 10, 1)), parse_date("0000-10-01"));
}

Now to figure out what’s broken in the code. Even without the intermediate input, we can say with reasonable confidence that the year and day parts don’t come into the picture since both were reduced to the minimum allowable input. The month input was not, but was reduced to 10. This means we can infer that there’s something special about 10 that doesn’t hold for 9. In this case, that “special something” is being two digits wide. In our code:

    let month = &s[6..7];

We were off by one, and need to use the range 5..7. After fixing this, the test passes.
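For reference, the parser with both fixes applied (the ASCII check and the corrected month range) would look like this. It is the earlier listing with one index changed.

```rust
fn parse_date(s: &str) -> Option<(u32, u32, u32)> {
    if 10 != s.len() { return None; }

    // Ignore non-ASCII strings so we don't need to deal with Unicode.
    if !s.is_ascii() { return None; }

    if "-" != &s[4..5] || "-" != &s[7..8] { return None; }

    let year = &s[0..4];
    // FIXED: the month occupies bytes 5 and 6, so the range is 5..7.
    let month = &s[5..7];
    let day = &s[8..10];

    year.parse::<u32>().ok().and_then(
        |y| month.parse::<u32>().ok().and_then(
            |m| day.parse::<u32>().ok().map(
                |d| (y, m, d))))
}
```

With this version, "0000-10-01" parses as (0, 10, 1), and the Unicode input found earlier is still rejected with None.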

The proptest! macro has some additional syntax, including for setting configuration for things like the number of test cases to generate. See its documentation for more details.

There is a more in-depth tutorial further down.

Differences between QuickCheck and Proptest

QuickCheck and Proptest are similar in many ways: both generate random inputs for a function to check certain properties, and automatically shrink inputs to minimal failing cases.

The one big difference is that QuickCheck generates and shrinks values based on type alone, whereas Proptest uses explicit Strategy objects. The QuickCheck approach has a lot of disadvantages in comparison:

  • QuickCheck can only define one generator and shrinker per type. If you need a custom generation strategy, you need to wrap it in a newtype and implement traits on that by hand. In Proptest, you can define arbitrarily many different strategies for the same type, and there are plenty built-in.

  • For the same reason, QuickCheck has a single “size” configuration that tries to define the range of values generated. If you need an integer between 0 and 100 and another between 0 and 1000, you probably need to do another newtype. In Proptest, you can directly just express that you want a 0..100 integer and a 0..1000 integer.

  • Types in QuickCheck are not easily composable. Defining Arbitrary and Shrink for a new struct which is simply produced by the composition of its fields requires implementing both by hand, including a bidirectional mapping between the struct and a tuple of its fields. In Proptest, you can make a tuple of the desired components and then prop_map it into the desired form. Shrinking happens automatically in terms of the input types.

  • Because constraints on values cannot be expressed in QuickCheck, generation and shrinking may lead to a lot of input rejections. Strategies in Proptest are aware of simple constraints and do not generate or shrink to values that violate them.

The author of Hypothesis also has an article on this topic.

Of course, there are also some relative downsides that fall out of what Proptest does differently:

  • Generating complex values in Proptest can be up to an order of magnitude slower than in QuickCheck. This is because QuickCheck performs stateless shrinking based on the output value, whereas Proptest must hold on to all the intermediate states and relationships in order for its richer shrinking model to work.

Limitations of Property Testing

Given infinite time, property testing will eventually explore the whole input space to a test. However, time is not infinite, so only a randomly sampled portion of the input space can be explored. This means that property testing is extremely unlikely to find single-value edge cases in a large space. For example, the following test will virtually always pass:

#[macro_use] extern crate proptest;
use proptest::prelude::*;

proptest! {
    #[test]
    fn i64_abs_is_never_negative(a: i64) {
        // This actually fails if a == i64::MIN, but randomly picking one
        // specific value out of 2⁶⁴ is overwhelmingly unlikely.
        assert!(a.abs() >= 0);
    }
}

Because of this, traditional unit testing with intelligently selected cases is still necessary for many kinds of problems.
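As an illustration of such a hand-picked case, the sketch below pins down the i64::MIN behaviour directly with the standard library's checked_abs. The function names abs_is_non_negative and chance_of_hitting_one_value are hypothetical, and both the figure of 256 cases per run and the uniform sampling are assumptions made only for the arithmetic.

```rust
/// Hypothetical property: `a.abs()` is representable and non-negative.
/// `checked_abs` returns `None` when the absolute value does not fit,
/// which happens only for `i64::MIN`.
fn abs_is_non_negative(a: i64) -> bool {
    a.checked_abs().map_or(false, |v| v >= 0)
}

/// Upper bound on the chance that `cases` uniformly random 64-bit draws
/// include one specific value (assumption: uniform sampling).
fn chance_of_hitting_one_value(cases: u32) -> f64 {
    f64::from(cases) / 2f64.powi(64)
}
```

A plain unit test asserting that abs_is_non_negative(i64::MIN) is false covers the edge case deterministically, whereas chance_of_hitting_one_value(256) is on the order of 1e-17.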

Similarly, in some cases it can be hard or impossible to define a strategy which actually produces useful inputs. A strategy of .{1,4096} may be great to fuzz a C parser, but is highly unlikely to produce anything that makes it to a code generator.

Failure Persistence

By default, when Proptest finds a failing test case, it persists that failing case in a file named after the source containing the failing test, but in a separate directory tree rooted at proptest-regressions†. Later runs of tests will replay those test cases before generating novel cases. This ensures that the test will not fail on one run and then spuriously pass on the next, and also exposes similar tests to the same known-problematic input.

(† If you do not have an obvious source directory, you may instead find files next to the source files, with a different extension.)

It is recommended to check these files in to your source control so that other test runners (e.g., collaborators or a CI system) also replay these cases.

Note that, by default, all tests in the same crate will share that one persistence file. If you have a very large number of tests, it may be desirable to separate them into smaller groups so the number of extra test cases that get run is reduced. This can be done by adjusting the failure_persistence flag on Config.

There are two ways this persistence could theoretically be done.

The immediately obvious option is to persist a representation of the value itself, for example by using Serde. While this has some advantages, particularly being resistant to changes like tweaking the input strategy, it also has a lot of problems. Most importantly, there is no way to determine whether any given value is actually within the domain of the strategy that produces it. Thus, some (likely extremely fragile) mechanism to ensure that the strategy that produced the value exactly matches the one in use in a test case would be required.

The other option is to store the seed that was used to produce the failing test case. This approach requires no support from the strategy or the produced value. If the strategy in use differs from the one used to produce the failing case that was persisted, the seed may or may not produce the problematic value, but nonetheless produces a valid value. Due to these advantages, this is the approach Proptest uses.
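The trade-off can be illustrated with a toy generator. This is only a sketch under stated assumptions: xorshift64 and generate are hypothetical stand-ins, not proptest's RNG or its persistence format. It shows that replaying a stored seed reproduces the same value while the strategy is unchanged, and still yields an in-range value after the strategy's bounds change.

```rust
/// One step of a xorshift64 generator (a stand-in RNG, not proptest's).
/// The seed should be non-zero.
fn xorshift64(mut x: u64) -> u64 {
    x ^= x << 13;
    x ^= x >> 7;
    x ^= x << 17;
    x
}

/// Hypothetical "strategy": derives a value in `lo..hi` from a seed.
/// Assumes `lo < hi`.
fn generate(seed: u64, lo: u32, hi: u32) -> u32 {
    let span = u64::from(hi - lo);
    // The remainder is below `span`, so it always fits in a u32.
    lo + (xorshift64(seed) % span) as u32
}
```

Calling generate twice with the same seed and bounds gives the same value; calling it with the same seed but different bounds gives a value that may differ from the original failure but is always inside the new range.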

Forking and Timeouts

By default, proptest tests are run in-process and are allowed to run for however long it takes them. This is resource-efficient and produces the nicest test output, and for many use cases is sufficient. However, problems like overflowing the stack, aborting the process, or getting stuck in an infinite loop will simply break the entire test process and prevent proptest from determining a minimal reproducible case.

As of version 0.7.1, proptest has optional “fork” and “timeout” features (both enabled by default), which make it possible to run your test cases in a subprocess and limit how long they may run. This is generally slower, may make using a debugger more difficult, and makes test output harder to interpret, but allows proptest to find and minimise test cases for these situations as well.

To enable these features, simply set the fork and/or timeout fields on the Config. (Setting timeout implies fork.)

Here is a simple example of using both features:

#[macro_use] extern crate proptest;
use proptest::prelude::*;

// The worst possible way to calculate Fibonacci numbers
fn fib(n: u64) -> u64 {
    if n <= 1 {
        n
    } else {
        fib(n - 1) + fib(n - 2)
    }
}

proptest! {
    #![proptest_config(ProptestConfig {
        // Setting both fork and timeout is redundant since timeout implies
        // fork, but both are shown for clarity.
        fork: true,
        timeout: 1000,
        .. ProptestConfig::default()
    })]

    #[test]
    fn test_fib(n: u64) {
        // For large n, this will variously run for an extremely long time,
        // overflow the stack, or panic due to integer overflow.
        assert!(fib(n) >= n);
    }
}

The exact value of the test failure depends heavily on the performance of the host system, the Rust version, and compiler flags, but on the system where it was originally tested, it found that the maximum value that fib() could handle was 39, despite having dozens of processes dump core due to stack overflow or time out along the way.
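The integer-overflow limit, unlike the timeout, does not depend on the host. The sketch below is a hypothetical iterative variant, fib_checked (not part of the example above), that uses checked_add to report where a u64 result stops being representable.

```rust
/// Iterative Fibonacci with the same definition as `fib` above
/// (fib(0) = 0, fib(1) = 1), returning `None` when the result
/// does not fit in a `u64`.
fn fib_checked(n: u64) -> Option<u64> {
    if n == 0 {
        return Some(0);
    }
    // Invariant: after k loop iterations, `curr` holds fib(k + 1).
    let (mut prev, mut curr) = (0u64, 1u64);
    for _ in 1..n {
        let next = prev.checked_add(curr)?;
        prev = curr;
        curr = next;
    }
    Some(curr)
}
```

Here fib_checked(93) is the largest result that still fits, and fib_checked(94) returns None.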

If you just want to run tests in subprocesses or with a timeout every now and then, you can do that by setting the PROPTEST_FORK or PROPTEST_TIMEOUT environment variables to alter the default configuration. For example, on Unix,

PROPTEST_FORK=true cargo test
PROPTEST_TIMEOUT=1000 cargo test

In-Depth Tutorial

This tutorial will introduce proptest from the bottom up, starting from the basic building blocks, in the hopes of making the model as a whole clear. In particular, we’ll start off without using the macros so that the macros can later be understood in terms of what they expand into rather than magic. But as a result, the first part is not representative of how proptest is normally used. If bottom-up isn’t your style, you may wish to skim the first few sections.

Also note that the examples here focus on the usage of proptest itself, and as such generally have trivial test bodies. In real code, you would obviously have assertions and so forth in the test bodies.

Strategy Basics

The Strategy is the most fundamental concept in proptest. A strategy defines two things:

  • How to generate random values of a particular type from a random number generator.

  • How to “shrink” such values into “simpler” forms.

Proptest ships with a substantial library of strategies. Some of these are defined in terms of built-in types; for example, 0..100i32 is a strategy to generate i32s between 0, inclusive, and 100, exclusive. As we’ve already seen, strings are themselves strategies for generating strings which match the former as a regular expression.

Generating a value is a two-step process. First, a TestRunner is passed to the new_tree() method of the Strategy; this returns a ValueTree, which we’ll look at in more detail momentarily. Calling the current() method on the ValueTree produces the actual value. Knowing that, we can put the pieces together and generate values. The below is the tutorial-strategy-play.rs example:

extern crate proptest;

use proptest::test_runner::TestRunner;
use proptest::strategy::{Strategy, ValueTree};

fn main() {
    let mut runner = TestRunner::default();
    let int_val = (0..100i32).new_tree(&mut runner).unwrap();
    let str_val = "[a-z]{1,4}\\p{Cyrillic}{1,4}\\p{Greek}{1,4}"
        .new_tree(&mut runner).unwrap();
    println!("int_val = {}, str_val = {}",
             int_val.current(), str_val.current());
}

If you run this a few times, you’ll get output similar to the following:

$ target/debug/examples/tutorial-strategy-play
int_val = 99, str_val = vѨͿἕΌ
$ target/debug/examples/tutorial-strategy-play
int_val = 25, str_val = cwᵸійΉ
$ target/debug/examples/tutorial-strategy-play
int_val = 5, str_val = oegiᴫᵸӈᵸὛΉ

This knowledge is sufficient to build an extremely primitive fuzzing test.

extern crate proptest;

use proptest::test_runner::TestRunner;
use proptest::strategy::{Strategy, ValueTree};

fn some_function(v: i32) {
    // Do a bunch of stuff, but crash if v > 500
    assert!(v <= 500);
}

#[test]
fn some_function_doesnt_crash() {
    let mut runner = TestRunner::default();
    for _ in 0..256 {
        let val = (0..10000i32).new_tree(&mut runner).unwrap();
        some_function(val.current());
    }
}

This works, but when the test fails, we don’t get much context, and even if we recover the input, we see some arbitrary-looking value like 1771 rather than the boundary condition of 501. For a function taking just an integer, this is probably still good enough, but as inputs get more complex, interpreting completely random values becomes increasingly difficult.

Shrinking Basics

Finding the “simplest” input that causes a test failure is referred to as shrinking. This is where the intermediate ValueTree type comes in. Besides current(), it provides two methods — simplify() and complicate() — which together allow binary searching over the input space. The tutorial-simplify-play.rs example shows how repeated calls to simplify() produce incrementally “simpler” outputs, both in terms of size and in characters used.

extern crate proptest;

use proptest::test_runner::TestRunner;
use proptest::strategy::{Strategy, ValueTree};

fn main() {
    let mut runner = TestRunner::default();
    let mut str_val = "[a-z]{1,4}\\p{Cyrillic}{1,4}\\p{Greek}{1,4}"
        .new_tree(&mut runner).unwrap();
    println!("str_val = {}", str_val.current());
    while str_val.simplify() {
        println!("        = {}", str_val.current());
    }
}

A couple runs:

$ target/debug/examples/tutorial-simplify-play
str_val = vy꙲ꙈᴫѱΆῨῨ
        = y꙲ꙈᴫѱΆῨῨ
        = y꙲ꙈᴫѱΆῨῨ
        = m꙲ꙈᴫѱΆῨῨ
        = g꙲ꙈᴫѱΆῨῨ
        = d꙲ꙈᴫѱΆῨῨ
        = b꙲ꙈᴫѱΆῨῨ
        = a꙲ꙈᴫѱΆῨῨ
        = aꙈᴫѱΆῨῨ
        = aᴫѱΆῨῨ
        = aѱΆῨῨ
        = aѱΆῨῨ
        = aѱΆῨῨ
        = aиΆῨῨ
        = aМΆῨῨ
        = aЎΆῨῨ
        = aЇΆῨῨ
        = aЃΆῨῨ
        = aЁΆῨῨ
        = aЀΆῨῨ
        = aЀῨῨ
        = aЀῨ
        = aЀῨ
        = aЀῢ
        = aЀ῟
        = aЀ῞
        = aЀ῝
$ target/debug/examples/tutorial-simplify-play
str_val = dyiꙭᾪῇΊ
        = yiꙭᾪῇΊ
        = iꙭᾪῇΊ
        = iꙭᾪῇΊ
        = iꙭᾪῇΊ
        = eꙭᾪῇΊ
        = cꙭᾪῇΊ
        = bꙭᾪῇΊ
        = aꙭᾪῇΊ
        = aꙖᾪῇΊ
        = aꙋᾪῇΊ
        = aꙅᾪῇΊ
        = aꙂᾪῇΊ
        = aꙁᾪῇΊ
        = aꙀᾪῇΊ
        = aꙀῇΊ
        = aꙀΊ
        = aꙀΊ
        = aꙀΊ
        = aꙀΉ
        = aꙀΈ

Note that shrinking never shrinks a value to something outside the range the strategy describes. Notice the strings in the above example still match the regular expression even in the end. An integer drawn from 100..1000i32 will shrink towards zero, but will stop at 100 since that is the minimum value.
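The binary search behind simplify() and complicate() can be sketched for integers with the standard library alone. The BinarySearch type and shrink_trace function below are hypothetical illustrations, not proptest's implementation; the sketch assumes the test predicate is deterministic.

```rust
/// Hypothetical integer shrinker; a sketch, not proptest's actual type.
struct BinarySearch {
    lo: u32,   // smallest value that may still be tried
    curr: u32, // value currently under test
    hi: u32,   // smallest value known to fail so far
}

impl BinarySearch {
    /// `min` is the strategy's lower bound; `failing` is a known failure.
    fn new(min: u32, failing: u32) -> Self {
        BinarySearch { lo: min, curr: failing, hi: failing }
    }

    fn current(&self) -> u32 { self.curr }

    /// The current value failed: try halfway between `lo` and it.
    fn simplify(&mut self) -> bool {
        if self.curr <= self.lo { return false; }
        self.hi = self.curr;
        self.curr = self.lo + (self.hi - self.lo) / 2;
        true
    }

    /// The current value passed: move back toward the last failure.
    fn complicate(&mut self) -> bool {
        if self.curr >= self.hi { return false; }
        self.lo = self.curr + 1;
        self.curr = self.lo + (self.hi - self.lo) / 2;
        true
    }
}

/// Shrinks `failing` and returns every value tried along the way.
/// `fails` must return true for `failing` itself.
fn shrink_trace(min: u32, failing: u32, fails: impl Fn(u32) -> bool) -> Vec<u32> {
    let mut tree = BinarySearch::new(min, failing);
    let mut tried = vec![tree.current()];
    loop {
        let moved = if fails(tree.current()) {
            tree.simplify()
        } else {
            tree.complicate()
        };
        if !moved { break; }
        tried.push(tree.current());
    }
    tried
}
```

With a predicate that fails for months of 10 or more, shrink_trace(1, 12, |m| m >= 10) visits 12, 6, 9, 11, 10, the same sequence seen for m in the date-parser output earlier. With a minimum of 100, the search stops at 100 rather than continuing to zero.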

simplify() and complicate() can be used to adapt our primitive fuzz test to actually find the boundary condition.

extern crate proptest;

use proptest::test_runner::TestRunner;
use proptest::strategy::{Strategy, ValueTree};

fn some_function(v: i32) -> bool {
    // Do a bunch of stuff, but crash if v > 500
    // assert!(v <= 500);
    // But return a boolean instead of panicking for simplicity
    v <= 500
}

// We know the function is broken, so use a purpose-built main function to
// find the breaking point.
fn main() {
    let mut runner = TestRunner::default();
    for _ in 0..256 {
        let mut val = (0..10000i32).new_tree(&mut runner).unwrap();
        if some_function(val.current()) {
            // Test case passed
            continue;
        }

        // We found our failing test case, simplify it as much as possible.
        loop {
            if !some_function(val.current()) {
                // Still failing, find a simpler case
                if !val.simplify() {
                    // No more simplification possible; we're done
                    break;
                }
            } else {
                // Passed this input, back up a bit
                if !val.complicate() {
                    break;
                }
            }
        }

        println!("The minimal failing case is {}", val.current());
        assert_eq!(501, val.current());
        return;
    }
    panic!("Didn't find a failing test case");
}

This code reliably finds the boundary of the failure, 501.

Using the Test Runner

The above is quite a bit of code though, and it can’t handle things like panics. Fortunately, proptest’s TestRunner provides this functionality for us. The method we’re interested in is run. We simply give it the strategy and a function to test inputs and it takes care of the rest.

extern crate proptest;

use proptest::test_runner::{Config, FileFailurePersistence,
                            TestError, TestRunner};

fn some_function(v: i32) {
    // Do a bunch of stuff, but crash if v > 500.
    // We return to normal `assert!` here since `TestRunner` catches
    // panics.
    assert!(v <= 500);
}

// We know the function is broken, so use a purpose-built main function to
// find the breaking point.
fn main() {
    let mut runner = TestRunner::new(Config {
        // Turn failure persistence off for demonstration
        failure_persistence: Some(Box::new(FileFailurePersistence::Off)),
        .. Config::default()
    });
    let result = runner.run(&(0..10000i32), |v| {
        some_function(v);
        Ok(())
    });
    match result {
        Err(TestError::Fail(_, value)) => {
            println!("Found minimal failing case: {}", value);
            assert_eq!(501, value);
        },
        result => panic!("Unexpected result: {:?}", result),
    }
}

That’s a lot better! Still a bit boilerplatey; the proptest! macro will help with that, but it does some other stuff we haven’t covered yet, so for the moment we’ll keep using TestRunner directly.

Compound Strategies

Testing functions that take single arguments of primitive types is nice and all, but is kind of underwhelming. Back when we were writing the whole stack by hand, extending the technique to, say, two integers was clear, if verbose. But TestRunner only takes a single Strategy; how can we test a function that needs inputs from more than one?

use proptest::test_runner::TestRunner;

fn add(a: i32, b: i32) -> i32 {
    a + b
}

#[test]
fn test_add() {
    let mut runner = TestRunner::default();
    runner.run(/* uhhm... */).unwrap();
}

The key is that strategies are composable. The simplest form of composition is “compound strategies”, where we take multiple strategies and combine their values into one value that holds each input separately. There are several of these. The simplest is a tuple; a tuple of strategies is itself a strategy for tuples of the values those strategies produce. For example, (0..100i32,100..1000i32) is a strategy for pairs of integers where the first value is between 0 and 100 and the second is between 100 and 1000.

So for our two-argument function, our strategy is simply a tuple of ranges.

use proptest::test_runner::TestRunner;

fn add(a: i32, b: i32) -> i32 {
    a + b
}

#[test]
fn test_add() {
    let mut runner = TestRunner::default();
    // Combine our two inputs into a strategy for one tuple. Our test
    // function then destructures the generated tuples back into separate
    // `a` and `b` variables to be passed in to `add()`.
    runner.run(&(0..1000i32, 0..1000i32), |(a, b)| {
        let sum = add(a, b);
        assert!(sum >= a);
        assert!(sum >= b);
        Ok(())
    }).unwrap();
}

Other compound strategies include fixed-size arrays of strategies and Vecs of strategies (which produce arrays or Vecs of values parallel to the strategy collection), as well as the various strategies provided in the collection module.

Syntax Sugar: proptest!

Now that we know about compound strategies, we can understand how the proptest! macro works. Our example from the prior section can be rewritten using that macro like so:

#[macro_use] extern crate proptest;

fn add(a: i32, b: i32) -> i32 {
    a + b
}

proptest! {
    #[test]
    fn test_add(a in 0..1000i32, b in 0..1000i32) {
        let sum = add(a, b);
        assert!(sum >= a);
        assert!(sum >= b);
    }
}

Conceptually, the desugaring process is fairly simple. At the start of the test function, a new TestRunner is constructed. The input strategies (after the in keyword) are grouped into a tuple. That tuple is passed in to the TestRunner as the input strategy. The test body has Ok(()) added to the end, then is put into a lambda that destructures the generated input tuple back into the named parameters and then runs the body. The end result is extremely similar to what we wrote by hand in the prior section.

proptest! actually does a few other things in order to make failure output easier to read and to overcome the 10-tuple limit.

Transforming Strategies

Suppose you have a function that takes a string which needs to be the Display format of an arbitrary u32. A first attempt at providing this argument might be to use a regular expression, like so:

#[macro_use] extern crate proptest;

fn do_stuff(v: String) {
    let i: u32 = v.parse().unwrap();
    let s = i.to_string();
    assert_eq!(s, v);
}

proptest! {
    #[test]
    fn test_do_stuff(v in "[1-9][0-9]{0,8}") {
        do_stuff(v);
    }
}

This kind of works, but it has problems. For one, it does not explore the whole u32 space. It is possible to write a regular expression that does, but such an expression is rather long, and also results in a pretty odd distribution of values. The input also doesn’t shrink correctly, since proptest tries to shrink it in terms of a string rather than an integer.

What you really want to do is generate a u32 and then pass in its string representation. One way to do this is to just take u32 as an input to the test and then transform it to a string within the test code. This approach works fine, but isn’t reusable or composable. Ideally, we could get a strategy that does this.

The thing we’re looking for is the first strategy combinator, prop_map. We need to ensure Strategy is in scope to use it.

#[macro_use] extern crate proptest;
// Grab `Strategy` and a shorter namespace prefix
use proptest::prelude::*;

fn do_stuff(v: String) {
    let i: u32 = v.parse().unwrap();
    let s = i.to_string();
    assert_eq!(s, v);
}

proptest! {
    #[test]
    fn test_do_stuff(v in any::<u32>().prop_map(|v| v.to_string())) {
        do_stuff(v);
    }
}

Calling prop_map on a Strategy creates a new strategy which transforms every generated value using the provided function. Proptest retains the relationship between the original Strategy and the transformed one; as a result, shrinking occurs in terms of u32, even though we’re generating a String.
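One way to picture this is a wrapper that keeps the source value and applies the function only when the value is read. The HalvingTree and MapTree types below are hypothetical, a minimal sketch rather than proptest's own mapping implementation, and the halving shrinker is a deliberately crude stand-in for the real search.

```rust
/// Crude stand-in shrinker: halves a `u32` toward zero.
struct HalvingTree {
    curr: u32,
}

impl HalvingTree {
    fn current(&self) -> u32 { self.curr }

    fn simplify(&mut self) -> bool {
        if self.curr == 0 { return false; }
        self.curr /= 2;
        true
    }
}

/// Wraps a source tree and a mapping function. Shrinking is delegated to
/// the source; the function is applied only when the value is read.
struct MapTree<F> {
    source: HalvingTree,
    fun: F,
}

impl<O, F: Fn(u32) -> O> MapTree<F> {
    fn current(&self) -> O { (self.fun)(self.source.current()) }

    fn simplify(&mut self) -> bool { self.source.simplify() }
}
```

Wrapping HalvingTree { curr: 40 } with |v: u32| v.to_string() yields "40", then "20" after one simplify(): the number is halved and re-rendered, instead of characters being dropped from the string.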

prop_map is also the principal way to define strategies for new types, since most types are simply composed of other, simpler values.

Let’s update our code so it takes a more interesting structure.

#[macro_use] extern crate proptest;
use proptest::prelude::*;

#[derive(Clone, Debug)]
struct Order {
  id: String,
  // Some other fields, though the test doesn't do anything with them
  item: String,
  quantity: u32,
}

fn do_stuff(order: Order) {
    let i: u32 = order.id.parse().unwrap();
    let s = i.to_string();
    assert_eq!(s, order.id);
}

proptest! {
    #[test]
    fn test_do_stuff(
        order in
        (any::<u32>().prop_map(|v| v.to_string()),
         "[a-z]*", 1..1000u32).prop_map(
             |(id, item, quantity)| Order { id, item, quantity })
    ) {
        do_stuff(order);
    }
}

Notice how we were able to take the output from prop_map and put it in a tuple, then call prop_map on that tuple to produce yet another value.

But that’s quite a mouthful in the argument list. Fortunately, strategies are normal values, so we can extract it to a function.

#[macro_use] extern crate proptest;
use proptest::prelude::*;

// snip

fn arb_order(max_quantity: u32) -> BoxedStrategy<Order> {
    (any::<u32>().prop_map(|v| v.to_string()),
     "[a-z]*", 1..max_quantity)
    .prop_map(|(id, item, quantity)| Order { id, item, quantity })
    .boxed()
}

proptest! {
    #[test]
    fn test_do_stuff(order in arb_order(1000)) {
        do_stuff(order);
    }
}

We boxed() the strategy in the function since otherwise the type would not be nameable, and even if it were, it would be very hard to read or write. Boxing a Strategy turns both it and its ValueTrees into trait objects, which both makes the types simpler and can be used to mix heterogeneous Strategy types as long as they produce the same value types.

The arb_order() function is also parameterised, which is another advantage of extracting strategies to separate functions. In this case, if we have a test that needs an Order with no more than a dozen items, we can simply call arb_order(12) rather than needing to write out a whole new strategy.

We can also use -> impl Strategy<Value = Order> instead to avoid the overhead of boxing, as in the following example. You should use -> impl Strategy<..> unless you need the dynamic dispatch.

#[macro_use] extern crate proptest;
use proptest::prelude::*;

// snip


fn arb_order(max_quantity: u32) -> impl Strategy<Value = Order> {
    (any::<u32>().prop_map(|v| v.to_string()),
     "[a-z]*", 1..max_quantity)
    .prop_map(|(id, item, quantity)| Order { id, item, quantity })
}

proptest! {
    #[test]
    fn test_do_stuff(order in arb_order(1000)) {
        do_stuff(order);
    }
}

Syntax Sugar: prop_compose!

Defining strategy-returning functions like this is extremely useful, but the code above is a bit verbose, as well as hard to read for similar reasons to writing test functions by hand.

To simplify this task, proptest includes the prop_compose! macro. Before going into details, here’s our code from above rewritten to use it.

#[macro_use] extern crate proptest;
use proptest::prelude::*;

// snip

prop_compose! {
    fn arb_order_id()(id in any::<u32>()) -> String {
        id.to_string()
    }
}
prop_compose! {
    fn arb_order(max_quantity: u32)
                (id in arb_order_id(), item in "[a-z]*",
                 quantity in 1..max_quantity)
                -> Order {
        Order { id, item, quantity }
    }
}

proptest! {
    #[test]
    fn test_do_stuff(order in arb_order(1000)) {
        do_stuff(order);
    }
}

We had to extract arb_order_id() out into its own function, but otherwise this desugars to almost exactly what we wrote in the previous section. The generated function takes the first parameter list as arguments. These arguments are used to select the strategies in the second argument list. Values are then drawn from those strategies and transformed by the function body. The actual function has a return type of impl Strategy<Value = T> where T is the declared return type.

Generating Enums

The syntax sugar for defining strategies for enums is currently somewhat limited. Creating such strategies with prop_compose! is possible but generally is not very readable, so in most cases defining the function by hand is preferable.

The core building block is the prop_oneof! macro, in which you list one case for each case in your enum. For enums which have no data, the strategy for each case is Just(YourEnum::TheCase). Enum cases with data generally require putting the data in a tuple and then using prop_map to map it into the enum case.

Here is a simple example:

#[macro_use] extern crate proptest;
use proptest::prelude::*;

#[derive(Debug, Clone)]
enum MyEnum {
    SimpleCase,
    CaseWithSingleDatum(u32),
    CaseWithMultipleData(u32, String),
}

fn my_enum_strategy() -> impl Strategy<Value = MyEnum> {
    prop_oneof![
        // For cases without data, `Just` is all you need
        Just(MyEnum::SimpleCase),

        // For cases with data, write a strategy for the interior data, then
        // map into the actual enum case.
        any::<u32>().prop_map(MyEnum::CaseWithSingleDatum),

        (any::<u32>(), ".*").prop_map(
            |(a, b)| MyEnum::CaseWithMultipleData(a, b)),
    ]
}

In general, it is best to list the enum cases in order from “simplest” to “most complex”, since shrinking will shrink down toward items earlier in the list.

For particularly complex enum cases, it can be helpful to extract the strategy for that case into a separate function. Here, prop_compose! can be of use.

#[macro_use] extern crate proptest;
use proptest::prelude::*;

#[derive(Debug, Clone)]
enum MyComplexEnum {
    SimpleCase,
    AnotherSimpleCase,
    ComplexCase {
        product_code: String,
        id: u64,
        chapter: String,
    },
}

prop_compose! {
    fn my_complex_enum_complex_case()(
        product_code in "[0-9A-Z]{10,20}",
        id in 1u64..10000u64,
        chapter in "X{0,2}(V?I{1,3}|IV|IX)",
    ) -> MyComplexEnum {
        MyComplexEnum::ComplexCase { product_code, id, chapter }
    }
}

fn my_enum_strategy() -> BoxedStrategy<MyComplexEnum> {
    prop_oneof![
        Just(MyComplexEnum::SimpleCase),
        Just(MyComplexEnum::AnotherSimpleCase),
        my_complex_enum_complex_case(),
    ].boxed()
}

Filtering

Sometimes, you have a case where your input values have some sort of “irregular” constraint on them. For example, an integer needing to be even, or two values needing to be non-equal.

In general, the ideal solution is to find a way to take a seed value and then use prop_map to transform it into the desired, irregular domain. For example, to generate even integers, use something like

#[macro_use] extern crate proptest;

prop_compose! {
    // Generate arbitrary integers up to half the maximum desired value,
    // then multiply them by 2, thus producing only even integers in the
    // desired range.
    fn even_integer(max: i32)(base in 0..max/2) -> i32 { base * 2 }
}

For the cases where this is not viable, it is possible to filter strategies. Proptest actually divides filters into two categories:

  • “Local” filters apply to a single strategy. If a value is rejected, a new value is drawn from that strategy only.

  • “Global” filters apply to the whole test case. If the test case is rejected, the whole thing is regenerated.

The distinction is somewhat arbitrary, since something like a “global filter” could be created by just putting a “local filter” around the whole input strategy. In practise, the distinction lies in which code performs the rejection: the strategy itself or the body of the test.

A local filter is created with the prop_filter combinator. Besides a function indicating whether to accept the value, it also takes a description of the filter (a value such as a &'static str or a String), which it uses to record where/why the rejection happened.

#[macro_use] extern crate proptest;
use proptest::prelude::*;

proptest! {
    #[test]
    fn some_test(
      v in (0..1000u32)
        .prop_filter("Values must not be divisible by 7 xor 11",
                     |v| !((0 == v % 7) ^ (0 == v % 11)))
    ) {
        assert_eq!(0 == v % 7, 0 == v % 11);
    }
}
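To get a feel for how much work a filter like this causes, the predicate can be evaluated over the whole input range with plain Rust. This is only a counting sketch, not part of proptest; the function names are made up for illustration.

```rust
/// The predicate from the `prop_filter` example above, as a plain function.
fn accepted(v: u32) -> bool {
    !((v % 7 == 0) ^ (v % 11 == 0))
}

/// Counts how many values in `0..1000` pass the filter.
fn count_accepted() -> usize {
    (0..1000u32).filter(|&v| accepted(v)).count()
}
```

Here count_accepted() returns 792, so roughly one draw in five is rejected and has to be regenerated.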

Global filtering results when a test itself returns Err(TestCaseError::Reject). The prop_assume! macro provides an easy way to do this.

#[macro_use] extern crate proptest;

fn frob(a: i32, b: i32) -> (i32, i32) {
    let d = (a - b).abs();
    (a / d, b / d)
}

proptest! {
    #[test]
    fn test_frob(a in -1000..1000, b in -1000..1000) {
        // Input illegal if a==b.
        // Equivalent to
        // if (a == b) { return Err(TestCaseError::Reject(...)); }
        prop_assume!(a != b);

        let (a2, b2) = frob(a, b);
        assert!(a2.abs() <= a.abs());
        assert!(b2.abs() <= b.abs());
    }
}

While useful, filtering has a lot of disadvantages:

  • Since it is simply rejection sampling, it will slow down generation of test cases since values need to be generated additional times to satisfy the filter. In the case where a filter always returns false, a test could theoretically never generate a result.

  • Proptest tracks how many local and global rejections have happened, and aborts if they exceed a certain number. This prevents a test taking an extremely long time due to rejections, but means not all filters are viable in the default configuration. The limits for local and global rejections are different; by default, proptest allows a large number of local rejections but a fairly small number of global rejections, on the premise that the former are cheap but potentially common (having been built into the strategy) but the latter are expensive but rare (being an edge case in the particular test).

  • Shrinking and filtering do not play well together. When shrinking, if a value winds up being rejected, there is no pass/fail information to continue shrinking properly. Instead, proptest treats such a rejection the same way it handles a shrink that results in a passing test: by backing away from simplification with a call to complicate(). Thus encountering a filter rejection during shrinking prevents shrinking from continuing to any simpler values, even if there are some that would be accepted by the filter.

Generating Recursive Data

Randomly generating recursive data structures is trickier than it sounds. For example, the below is a naïve attempt at generating a JSON AST by using recursion. This also uses the prop_oneof! macro, which was introduced in the section on generating enums.

#[macro_use] extern crate proptest;

use std::collections::HashMap;
use proptest::prelude::*;

#[derive(Clone, Debug)]
enum Json {
    Null,
    Bool(bool),
    Number(f64),
    String(String),
    Array(Vec<Json>),
    Map(HashMap<String, Json>),
}

fn arb_json() -> impl Strategy<Value = Json> {
    prop_oneof![
        Just(Json::Null),
        any::<bool>().prop_map(Json::Bool),
        any::<f64>().prop_map(Json::Number),
        ".*".prop_map(Json::String),
        prop::collection::vec(arb_json(), 0..10).prop_map(Json::Array),
        prop::collection::hash_map(
          ".*", arb_json(), 0..10).prop_map(Json::Map),
    ]
}

Upon closer consideration, this obviously can’t work because arb_json() recurses unconditionally.

A more sophisticated attempt is to define one strategy for each level of nesting up to some maximum. This doesn’t overflow the stack, but as defined here, even four levels of nesting will produce trees with thousands of nodes; by eight levels, we get to tens of millions.
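The figures quoted above can be reproduced with a rough worst-case count. This sketch assumes that every collection holds 9 elements (the largest size a 0..10 range allows) and that every element is itself a collection until the last level; the function name is made up for illustration.

```rust
/// Worst-case node count of a tree in which every node on each of the
/// first `levels` levels has `max_children` children:
/// 1 + c + c^2 + ... + c^levels.
fn worst_case_nodes(levels: u32, max_children: u64) -> u64 {
    (0..=levels).map(|k| max_children.pow(k)).sum()
}
```

Under these assumptions, worst_case_nodes(4, 9) is 7381 and worst_case_nodes(8, 9) is 48427561, which lines up with "thousands" at four levels and "tens of millions" at eight.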

Proptest provides a more reliable solution in the form of the prop_recursive combinator. To use this, we create a strategy for the non-recursive case, then give the combinator that strategy, some size parameters, and a function to transform a nested strategy into a recursive strategy.

#[macro_use] extern crate proptest;

use std::collections::HashMap;
use proptest::prelude::*;

#[derive(Clone, Debug)]
enum Json {
    Null,
    Bool(bool),
    Number(f64),
    String(String),
    Array(Vec<Json>),
    Map(HashMap<String, Json>),
}

fn arb_json() -> impl Strategy<Value = Json> {
    let leaf = prop_oneof![
        Just(Json::Null),
        any::<bool>().prop_map(Json::Bool),
        any::<f64>().prop_map(Json::Number),
        ".*".prop_map(Json::String),
    ];
    leaf.prop_recursive(
      8, // 8 levels deep
      256, // Shoot for maximum size of 256 nodes
      10, // We put up to 10 items per collection
      |inner| prop_oneof![
          // Take the inner strategy and make the two recursive cases.
          prop::collection::vec(inner.clone(), 0..10)
              .prop_map(Json::Array),
          prop::collection::hash_map(".*", inner, 0..10)
              .prop_map(Json::Map),
      ])
}

Higher-Order Strategies

A higher-order strategy is a strategy which is generated by another strategy. That sounds kind of scary, so let’s consider an example first.

Say you have a function you want to test that takes a slice and an index into that slice. If we use a fixed size for the slice, it’s easy, but maybe we need to test with different slice sizes. We could try something with a filter:

#[macro_use] extern crate proptest;
use proptest::prelude::*;

fn some_function(stuff: &[String], index: usize) { /* do stuff */ }

proptest! {
    #[test]
    fn test_some_function(
        stuff in prop::collection::vec(".*", 1..100),
        index in 0..100usize
    ) {
        prop_assume!(index < stuff.len());
        // `stuff` is a Vec<String>, so borrow it to get a slice.
        some_function(&stuff, index);
    }
}

This doesn’t work very well. First off, you get a lot of global rejections since index will be outside of stuff 50% of the time. But secondly, it will be rare to actually get a small stuff vector, since it would have to randomly choose a small index at the same time.
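The 50% figure can be checked by enumerating every combination of vector length and index. This sketch assumes that lengths and indices are drawn independently and uniformly, which only approximates what the strategies actually do; the function name is made up for illustration.

```rust
/// Returns (accepted, total) over all pairs with `len` in 1..100 and
/// `index` in 0..100, where a pair is accepted when `index < len`.
fn accepted_pairs() -> (u32, u32) {
    let mut accepted = 0;
    let mut total = 0;
    for len in 1..100u32 {
        for index in 0..100u32 {
            total += 1;
            if index < len {
                accepted += 1;
            }
        }
    }
    (accepted, total)
}
```

The result is (4950, 9900), so exactly half of the pairs would be thrown away by prop_assume! under these assumptions.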

The solution is the prop_flat_map combinator. This is sort of like prop_map, except that the transform returns a strategy instead of a value. This is more easily understood by implementing our example:

#[macro_use] extern crate proptest;
use proptest::prelude::*;

fn some_function(stuff: Vec<String>, index: usize) {
    let _ = &stuff[index];
    // Do stuff
}

fn vec_and_index() -> impl Strategy<Value = (Vec<String>, usize)> {
    prop::collection::vec(".*", 1..100)
        .prop_flat_map(|vec| {
            let len = vec.len();
            (Just(vec), 0..len)
        })
}

proptest! {
    #[test]
    fn test_some_function((vec, index) in vec_and_index()) {
        some_function(vec, index);
    }
}

In vec_and_index(), we make a strategy to produce an arbitrary vector. But then we derive a new strategy based on values produced by the first one. The new strategy produces the generated vector unchanged, but also adds a valid index into that vector, which we can do by picking the strategy for that index based on the size of the vector.

Even though the new strategy specifies the singleton Just(vec) strategy for the vector, proptest still understands the connection to the original strategy and will shrink vec as well. All the while, index continues to be a valid index into vec.

prop_compose! actually allows making second-order strategies like this by simply providing three argument lists instead of two. The below desugars to something much like what we wrote by hand above, except that the index and vector’s positions are internally reversed due to borrowing limitations.

#[macro_use] extern crate proptest;
use proptest::prelude::*;

prop_compose! {
    fn vec_and_index()(vec in prop::collection::vec(".*", 1..100))
                    (index in 0..vec.len(), vec in Just(vec))
                    -> (Vec<String>, usize) {
        (vec, index)
    }
}

Defining a canonical Strategy for a type

We previously used the function any as in any::<u32>() to generate a strategy for all u32s. This function works with the trait Arbitrary, which QuickCheck users may be familiar with. In proptest, this trait is already implemented for most owned types in the standard library, but you can of course implement it for your own types.

In some cases, where it makes sense to define a canonical strategy, such as in the JSON AST example, it is a good idea to implement Arbitrary.

Stay tuned for more information about this. Soon you will be able to derive Arbitrary for a lot of cases.

Configuring the number of test cases required

The default number of successful test cases that must execute for a test as a whole to pass is currently 256. If you are not satisfied with this and want to run more or fewer, there are a few ways to do this.

The first way is to set the environment variable PROPTEST_CASES to a value that can be successfully parsed as a u32. That value then becomes the new default.

Another way is to use #![proptest_config(expr)] inside proptest! where expr : Config. To only change the number of test cases, you can simply write:

#[macro_use] extern crate proptest;
use proptest::test_runner::Config;

fn add(a: i32, b: i32) -> i32 { a + b }

proptest! {
    // The next line modifies the number of tests.
    #![proptest_config(Config::with_cases(1000))]
    #[test]
    fn test_add(a in 0..1000i32, b in 0..1000i32) {
        let sum = add(a, b);
        assert!(sum >= a);
        assert!(sum >= b);
    }
}

Through the same proptest_config mechanism you may fine-tune your configuration through the Config type. See its documentation for more information.

Conclusion

That’s it for the tutorial, at least for now. There are more details for the features discussed above on their individual documentation pages, and you can find out about all the strategies provided out-of-the-box by perusing the module tree below.

Modules

Defines the Arbitrary trait and related free functions and type aliases. See the trait for more information.
Support for strategies producing fixed-length arrays.
Strategies for working with bit sets.
Strategies for generating bool values.
Strategies for generating char values.
Strategies for generating std::collections of values.
Strategies to generate numeric values (as opposed to integers used as bit fields).
Strategies for generating std::Option values.
Re-exports the most commonly-needed APIs of proptest.
Strategies for combining delegate strategies into std::Results.
Strategies for generating values by taking samples of collections.
Defines the core traits used by Proptest.
Strategies for generating strings and byte strings from regular expressions.
State and functions for running proptest tests.
Support for combining strategies into tuples.

Macros

Similar to assert! from std, but returns a test failure instead of panicking if the condition fails.
Similar to assert_eq! from std, but returns a test failure instead of panicking if the condition fails.
Similar to assert_ne! from std, but returns a test failure instead of panicking if the condition fails.
Rejects the test input if assumptions are not met.
Convenience to define functions which produce new strategies.
Produce a strategy which picks one of the listed choices.
Easily define proptest tests.